Decision Procedures for Composition and Equivalence of Symbolic Finite State Transducers
Abstract
Finite automata model a wide array of applications in software engineering, from regular expressions to specification languages. Finite transducers extend finite automata to model functions on lists of elements, which in turn have uses in fields as diverse as computational linguistics and model-based testing. Symbolic finite transducers are a further generalization of finite transducers where transitions are labeled with formulas in a given background theory. Compared to classical finite transducers, symbolic transducers are far more succinct in the case of finite alphabets, because they have no need to enumerate all cases of a transition; symbolic transducers can also use theories with infinite alphabets, such as the theory of linear arithmetic over integers or reals. Given a decision procedure for the background theory, we show novel algorithms for composition and equivalence checking for a large class of symbolic finite transducers, namely the class of single-valued transducers. Our algorithms give rise to a complete decidable algebra of symbolic transducers. Unlike previous work, we do not need any syntactic restriction of the formulas on the transitions, only a decision procedure; in practice we leverage recent advances in satisfiability modulo theories (SMT) solvers. We show how to decide single-valuedness, which means that symbolic finite transducers arising from practice can be checked to see if our algorithms apply. Our base algorithms are unusual in that they are nonconstructive, so we exhibit a separate model generation algorithm that can quickly find counterexamples when two symbolic finite transducers are not equivalent. As a key example, string manipulation is a particularly important application of our theoretical results with immediate benefits to bug finding.
Our work makes symbolic finite transducers a practical approach for software engineering applications, such as the analysis of security-critical sanitization functions in web pages and model-based testing.
Microsoft Research Technical Report MSR-TR-2011-32, March 14, 2011
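To make the idea of transitions labeled with formulas concrete, the following is a minimal sketch, not the report's algorithms. It assumes guards and outputs are plain Python callables standing in for formulas and terms of a background theory; the names `Transition`, `SFT`, `run`, and `escape_lt` are hypothetical and chosen only for illustration. It runs a transducer on a concrete input and collects every possible output, so single-valuedness can be observed on that one input, but nothing here is a decision procedure.

```python
from dataclasses import dataclass
from typing import Callable, List, Set


@dataclass
class Transition:
    source: int
    guard: Callable[[str], bool]   # stands in for a formula over one input symbol
    output: Callable[[str], str]   # stands in for the output terms; returns a string
    target: int


@dataclass
class SFT:
    initial: int
    finals: Set[int]
    transitions: List[Transition]

    def run(self, word: str) -> Set[str]:
        """Return every output the transducer can produce for `word`."""
        # A configuration is (current state, output produced so far).
        configs = {(self.initial, "")}
        for symbol in word:
            next_configs = set()
            for state, produced in configs:
                for t in self.transitions:
                    if t.source == state and t.guard(symbol):
                        next_configs.add((t.target, produced + t.output(symbol)))
            configs = next_configs
        return {out for state, out in configs if state in self.finals}


# Two symbolic transitions cover the whole alphabet: one for '<', one for
# everything else. A classical transducer would need one transition per symbol.
escape_lt = SFT(
    initial=0,
    finals={0},
    transitions=[
        Transition(0, lambda c: c == "<", lambda c: "&lt;", 0),
        Transition(0, lambda c: c != "<", lambda c: c, 0),
    ],
)

print(escape_lt.run("a<b"))  # {'a&lt;b'}
```

Because the two guards are mutually exclusive, each input yields exactly one output here. Deciding that property for all inputs, rather than observing it on one, is where a solver for the background theory would come in.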
Similar resources
Equivalence of Extended Symbolic Finite Transducers
Symbolic Finite Transducers augment classic transducers with symbolic alphabets represented as parametric theories. Such extension enables succinctness and the use of potentially infinite alphabets while preserving closure and decidability properties. Extended Symbolic Finite Transducers further extend these objects by allowing transitions to read consecutive input elements in a single step. Wh...
Symbolic Tree Transducers
Symbolic transducers are useful in the context of web security as they form the foundation for sanitization of potentially malicious data. We define Symbolic Tree Transducers as a generalization of Regular Transducers as finite state input-output tree automata with logical constraints over a parametric background theory. We examine key closure properties of Symbolic Tree Transducers and we deve...
Formalizing Symbolic Decision Procedures for Regular Languages
This thesis studies decision procedures for the equivalence of regular languages represented symbolically as regular expressions or logical formulas. Traditional decision procedures in this context rush to dispose of the concise symbolic representation by translating it into finite automata, which then are efficiently minimized and checked for structural equality. We develop procedures that avo...
Equivalence of Finite-Valued Symbolic Finite Transducers
Symbolic Finite Transducers, or SFTs, are a representation of finite transducers that annotates transitions with logical formulas to denote sets of concrete transitions. This representation has practical advantages in applications for web security analysis, where it provides ways to succinctly represent web sanitizers that operate on large alphabets. More importantly, the representation is also ...
Minimization of Symbolic Transducers
Symbolic transducers extend classical finite state transducers to infinite or large alphabets like Unicode, and are a popular tool in areas requiring reasoning over string transformations where traditional techniques do not scale. Here we develop the theory for and an algorithm for computing quotients of such transducers under indistinguishability preserving equivalence relations over states su...